107 research outputs found

Fast and effective kernels for relational learning from texts

Author: Moschitti A
Zanzotto FM
Publication venue: ACM
Publication date: 01/01/2007

In this paper, we define a family of syntactic kernels for automatic relational learning from pairs of natural language sentences. We provide an efficient computation of such models by optimizing the dynamic programming algorithm of the kernel evaluation. Experiments with Support Vector Machines and the above kernels show the effectiveness and efficiency of our approach on two important natural language tasks: Textual Entailment Recognition and Question Answering.

ART

Automatic learning of textual entailments with cross-pair similarities

Author: Moschitti A
Zanzotto FM
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2006

In this paper, we define a novel similarity measure between examples of textual entailments and use it as a kernel function in Support Vector Machines (SVMs). This allows us to automatically learn the rewrite rules that describe a non-trivial set of entailment cases. The experiments with the data sets of the RTE 2005 challenge show an improvement of 4.4% over the state-of-the-art methods.

CiteSeerX

ART

Structured lexical similarity via convolution kernels on dependency trees

Author: Basili R
Croce D
Moschitti A
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2011

A central topic in natural language processing is the design of lexical and syntactic features suitable for the target application. In this paper, we study convolution dependency tree kernels for automatic engineering of syntactic and semantic patterns exploiting lexical similarities. We define efficient and powerful kernels for measuring the similarity between dependency structures whose lexical nodes have partly or completely different surface forms. The experiments with such kernels for question classification show unprecedented results, e.g., a 41% error reduction over the former state of the art. Additionally, semantic role classification confirms the benefit of semantic smoothing for dependency kernels.

CiteSeerX

ART

A Machine learning approach to textual entailment recognition

Author: Moschitti A
Pennacchiotti M
Zanzotto FM
Publication venue: Cambridge University Press
Publication date

Designing models for learning textual entailment recognizers from annotated examples is not an easy task, as it requires modeling the semantic relations and interactions involved between two pairs of text fragments. In this paper, we approach the problem by first introducing the class of pair feature spaces, which allow supervised machine learning algorithms to derive first-order rewrite rules from annotated examples. In particular, we propose syntactic and shallow semantic feature spaces, and compare them to standard ones. Extensive experiments demonstrate that our proposed spaces learn first-order derivations, while standard ones are not expressive enough to do so.

ART

Machine learning for emergent middleware

Author: A. Heß
A. Moschitti
D. Angluin
D. Peled
D.A. Cohn
F. Aarts
F. Howar
G.S. Blair
H. Raffelt
I. Katakis
M. Merten
O. Grinchtein
T. Joachims
Publication venue: Springer Science and Business Media LLC
Publication date: 28/08/2012

Highly dynamic and heterogeneous distributed systems are challenging today's middleware technologies. Existing middleware paradigms are unable to deliver on their most central promise, which is offering interoperability. In this paper, we argue for the need to dynamically synthesise distributed system infrastructures according to the current operating environment, thereby generating "Emergent Middleware" to mediate interactions among heterogeneous networked systems that interact in an ad hoc way. The paper outlines the overall architecture of Enablers underlying Emergent Middleware, and in particular focuses on the key role of learning in supporting such a process, spanning statistical learning to infer the semantics of networked system functions and automata learning to extract the related behaviours of networked systems.

Crossref

INRIA a CCSD electronic archive server

Open Research Online (The Open University)

Supervised semantic relation mining from linguistically noisy text documents

Author: Basili R
Giannone C
Moschitti A
Naggar P
Publication venue: Springer Verlag
Publication date: 01/01/2011

ART

Cross-language frame semantics transfer in bilingual corpora

Author: A. Moschitti
C.J. Fillmore
D. Gildea
L. Heyer
M. Palmer
T. Landauer
Publication venue: Springer-Verlag
Publication date: 01/01/2009

Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with Frame information for different non-English European languages. These works rely on the assumption that parallel corpora annotated for English can be used to transfer the semantic information to the other target languages. In this paper, a robust method based on a statistical machine translation step augmented with simple rule-based post-processing is presented. It alleviates problems related to preprocessing errors and the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are investigated here against the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.

CiteSeerX

Crossref

ART

Thread-level information for comment classification in community question answering

Author: Barron-Cedeno A.
Da San Martino G.
Filice S.
Joty S.
Marquez L.
Moschitti A.
Nakov P.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2015

Community Question Answering (cQA) is a new application of QA in social contexts (e.g., fora). It presents new interesting challenges and research directions, e.g., exploiting the dependencies between the different comments of a thread to select the best answer for a given question. In this paper, we explored two ways of modeling such dependencies: (i) by designing specific features looking globally at the thread; and (ii) by applying structure prediction models. We trained and evaluated our models on data from SemEval-2015 Task 3 on Answer Selection in cQA. Our experiments show that: (i) the thread-level features consistently improve the performance for a variety of machine learning models, yielding state-of-the-art results; and (ii) sequential dependencies between the answer labels captured by structured prediction models are not enough to improve the results, indicating that more information is needed in the joint model.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Tree similarity measurement for classifying questions by syntactic structures

Author: A Moschitti
B Croft
C Elzinga
D Croce
J Shawe-Taylor
K Zhang
M Mittendorfer
Z Lin
Publication venue: Springer Science and Business Media LLC
Publication date: 12/07/2016

Queen's University Belfast Research Portal

Crossref

Ulster University's Research Portal

Linguistic and statistically derived features for cause of death prediction from verbal autopsy text

Author: A. Moschitti
A.M. Cohen
C.J.L. Murray
E. Loper
G. King
K. Kahn
M. Gamon
P. Byass
P.D. Turney
S. Matsumoto
S. Pakhomov
T. Dunning
W.N. Francis
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Automatic Text Classification (ATC) is an emerging technology with economic importance given the unprecedented growth of text data. This paper reports on work in progress to develop methods for predicting Cause of Death from Verbal Autopsy (VA) documents recommended for use in low-income countries by the World Health Organisation. VA documents contain both coded data and open narrative. The task is formulated as a Text Classification problem and explores various combinations of linguistic and statistical approaches to determine how these may improve on the standard bag-of-words approach using a dataset of over 6400 VA documents that were manually annotated with cause of death. We demonstrate that a significant improvement of prediction accuracy can be obtained through a novel combination of statistical and linguistic features derived from the VA text. The paper explores the methods by which ATC may lead to improved accuracy in Cause of Death prediction.

Crossref

White Rose Research Online